
pm: migrate label-grouping → project fields + auto-archive PRs + issue skill#129

Closed
hanwencheng wants to merge 1 commit into main from pm/issue-metadata-migration

@hanwencheng
Member

Summary

The project board's "Labels" column was a 5-chip pile that broke group-by views. This PR moves the single-value categorization into typed project fields, recolors the multi-value labels per area, reserves red for human-attention flags, and adds the missing automation pieces: auto-archiving of closed PRs and a Claude Code skill for creating new issues with the right metadata.

Schema migration (live on litentry/projects/19)

| Was | Now |
| --- | --- |
| priority/p0..p3 labels | Priority field {Urgent, High, Medium, Low} |
| kind/* labels (7) | Kind field {Feature, Bug, Research, Docs, Refactor, Security, CI} |
| phase/v0..v4 labels | Milestones (M1..M7) |
| Single green area/* color (cluttered) | 17 distinct non-red colors per area |
| needs-arch-review, status/blocked, status/investigating, vendor-blocker | Recolored red (reserves red for "human needed") |
| | New TEXT field Blocked by for issue-dependency tracking |
| | New label needs-investigation (red) |

Migration verified on 40+ issues

#103 Priority=Urgent  Kind=Feature
#107 Priority=Urgent  Kind=Feature  Size=L
#111 Priority=High    Kind=Docs
#116 Priority=Urgent  Kind=Research
#126 Priority=Medium  Kind=Feature

16 obsolete labels deleted from the repo. 38 labels remain (17 area + 5 status + 3 human-attention flags + 2 community + GitHub built-ins).

New automation

  • .github/workflows/pm-auto-archive-closed-pr.yml — on pull_request.closed, archives the PR's item from the project board via archiveProjectV2Item. Built-in "Auto-archive items" only fires after 30 days; this fires on close so active views stay focused on in-flight work.
  • ~/.claude/skills/agentkeys-issue-create/SKILL.md — Claude Code skill that walks operators through Kind/Priority/Size/Area/Milestone/Blocked-by dropdowns and creates the issue with the right labels. Built-in workflows + the sync Action handle board placement + field population automatically.
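
As a rough illustration of the call the auto-archive workflow makes, the sketch below builds the GraphQL request body for the `archiveProjectV2Item` mutation. The helper name `build_archive_request` and the placeholder node IDs are hypothetical; the actual workflow file is not shown in this PR description, and it would still need to look up the real project and item IDs and send the body with the project token.

```python
import json

# Mutation text for archiving one item on a ProjectV2 board.
ARCHIVE_MUTATION = """
mutation($projectId: ID!, $itemId: ID!) {
  archiveProjectV2Item(input: {projectId: $projectId, itemId: $itemId}) {
    item { id }
  }
}
"""


def build_archive_request(project_id: str, item_id: str) -> str:
    """Return the JSON body for the GraphQL request (hypothetical helper)."""
    if not project_id or not item_id:
        raise ValueError("both project_id and item_id are required")
    return json.dumps({
        "query": ARCHIVE_MUTATION,
        "variables": {"projectId": project_id, "itemId": item_id},
    })


# Placeholder IDs for illustration only, not real node IDs.
example_body = build_archive_request("PVT_example", "PVTI_example")
```

Keeping the payload construction separate from the HTTP call makes the mutation easy to check without touching the live board.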

Script changes

  • setup-project-fields.sh: now also bootstraps Kind + Blocked-by; idempotent rebuild of Priority with new options; cleans Project <Name> zombies as before.
  • sync-fields-from-labels.sh: explicit p0→Urgent / p1→High / p2→Medium / p3→Low mapping table (kind sync uses case-insensitive direct match). Phase sync kept for back-compat until the Phase field is deleted.
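
The mapping rules described for sync-fields-from-labels.sh can be sketched in a few lines. This is an illustrative Python version, not the shell script itself; the function name `fields_from_labels` and the exact label spellings (`priority/p0`, `kind/feature`) are assumptions based on the label families named above.

```python
from typing import Dict, Iterable

# Explicit priority mapping, as listed in the PR description.
PRIORITY_BY_LABEL = {
    "priority/p0": "Urgent",
    "priority/p1": "High",
    "priority/p2": "Medium",
    "priority/p3": "Low",
}

# Kind field options, as listed in the PR description.
KIND_OPTIONS = ["Feature", "Bug", "Research", "Docs", "Refactor", "Security", "CI"]


def fields_from_labels(labels: Iterable[str]) -> Dict[str, str]:
    """Derive Priority and Kind field values from an issue's labels."""
    kind_by_lower = {option.lower(): option for option in KIND_OPTIONS}
    fields: Dict[str, str] = {}
    for label in labels:
        if label in PRIORITY_BY_LABEL:
            fields["Priority"] = PRIORITY_BY_LABEL[label]
        elif label.startswith("kind/"):
            # Case-insensitive direct match against the Kind options.
            option = kind_by_lower.get(label[len("kind/"):].lower())
            if option is not None:
                fields["Kind"] = option
    return fields
```

Labels outside the two families, such as area/* labels, are ignored and stay multi-value labels.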

Test plan

  • After merge: the PM_PROJECT_TOKEN secret (gh secret set PM_PROJECT_TOKEN) already exists, so no action is needed
  • Optional: trigger sync manually: gh workflow run pm-sync-fields-from-labels.yml --repo litentry/agentKeys
  • Optional: test auto-archive by closing any PR and verifying it disappears from the board's active views
  • Invoke /agentkeys-issue-create from Claude Code to create a test issue; verify it gets the right labels + lands on the board + Kind+Priority fields populate

What's NOT in this PR

  • Phase project field is deprecated but not deleted (still has data on issues). Operator can delete via UI when ready — the script leaves it alone.
  • Estimate project field is deprecated in favor of GitHub's built-in Size field. Same UI-cleanup-later posture.

Labels were cluttering the project board (5-chip stacks per item, no clean
group-by). Move the single-value categorization into GitHub project fields
where the board renders them as clean dropdowns + group-by columns.

## Schema migration

- priority/p0..p3 labels → Priority field {Urgent, High, Medium, Low}
- kind/* labels → Kind field {Feature, Bug, Research, Docs, Refactor, Security, CI}
- phase/v0..v4 labels → use GitHub Milestones (M1..M7) instead
- area/* labels: keep as multi-value labels, recolor each with a distinct
  non-red color (blue/teal/green/purple family) so the board legend reads cleanly
- needs-arch-review, needs-investigation, status/blocked, status/investigating,
  vendor-blocker → recolor red (reserved for "needs human attention")
- New TEXT field "Blocked by" for documenting issue dependencies

## Scripts

- setup-project-fields.sh: idempotent rebuild of Priority with new options,
  creates Kind + Blocked by, leaves deprecated Phase/Estimate untouched (operator
  can delete in UI when ready)
- sync-fields-from-labels.sh: explicit p0→Urgent/p1→High/p2→Medium/p3→Low mapping
  table; adds Kind sync via case-insensitive name match; keeps Phase for back-compat

## New automation

- .github/workflows/pm-auto-archive-closed-pr.yml: on pull_request.closed,
  archive the PR's project board item via archiveProjectV2Item mutation.
  Built-in "Auto-archive items" only archives by age (30+ days); this fires
  on close so the board stays focused.
- ~/.claude/skills/agentkeys-issue-create/SKILL.md: interactive issue creation
  skill that walks operators through Kind/Priority/Size/Area/Milestone/Blocked-by
  dropdowns and creates the issue with the right labels. Built-in workflows
  + sync Action handle board placement and field population.

## Verified live

- 40+ open issues synced: Priority + Kind populated from former labels
- Spot checks: #103 Priority=Urgent Kind=Feature, #126 Priority=Medium Kind=Feature,
  #116 Priority=Urgent Kind=Research, #111 Priority=High Kind=Docs
- 16 obsolete labels deleted from repo (priority/p0..p3, phase/v0..v4, kind/*)
- 38 labels remaining (17 area + 5 status + 3 human-attention + 2 community + GitHub built-ins)
hanwencheng added a commit that referenced this pull request May 24, 2026
…ub native

User feedback after live use of the migration:
- The label→field sync workflow is no longer needed (labels were deleted in
  PR #129; fields are now the source of truth, set via the issue-create skill
  or manually in UI).
- The workflow-drift audit workflow added noise without value (built-in
  workflows rarely drift, and the operator manages them in UI anyway).
- The Blocked-by TEXT project field duplicates GitHub's native issue
  relationships ("Mark as blocked by" / "Mark as blocking" in the UI side
  panel, keyboard `B B` / `B X`). Use the native feature.

## Removed

- .github/workflows/pm-workflow-audit.yml (drift detection — operator handles in UI)
- .github/workflows/pm-sync-fields-from-labels.yml (labels-to-fields sync — labels are gone)
- pm/expected-workflows.json (declarative expectation for the audit)
- pm/scripts/check-workflows.sh (called by the audit)
- pm/scripts/sync-fields-from-labels.sh (called by the sync workflow)
- "Blocked by" project field (deleted via API; setup-project-fields.sh no longer creates it)

## Kept / added

- .github/workflows/pm-auto-archive-closed-pr.yml — auto-archives PRs from the
  board on close (built-in Auto-archive only fires after 30 days)
- pm/scripts/sync-size-from-effort.sh (NEW) — one-shot bulk-populate of the
  Size project field by parsing each issue's "## Effort" body section.
  Idempotent (skips already-sized items). Defaults to M when no parseable
  effort line found.
- ~/.claude/skills/agentkeys-issue-create — updated to:
  - Set Kind/Priority/Size project fields directly via API (replaces deleted
    label-sync workflow)
  - Use GitHub native relationships for blocked-by (replaces removed field)
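
A minimal sketch of the sizing rule described for sync-size-from-effort.sh, written in Python for illustration. The commit message does not say what an effort line looks like, so the token format assumed here (a standalone XS/S/M/L/XL inside the "## Effort" section) and the function name `size_from_body` are hypothetical.

```python
import re
from typing import Optional

DEFAULT_SIZE = "M"  # documented fallback when no effort line parses


def size_from_body(body: str, current_size: Optional[str] = None) -> Optional[str]:
    """Return the Size to set, or None when the item is already sized."""
    if current_size:
        return None  # idempotent: skip already-sized items
    # Capture everything under "## Effort" up to the next "## " heading.
    section = re.search(r"^## Effort\s*\n(.*?)(?=^## |\Z)", body, re.S | re.M)
    if section:
        # Assumed format: a standalone size token such as "Size: L".
        token = re.search(r"\b(XS|XL|S|M|L)\b", section.group(1))
        if token:
            return token.group(1)
    return DEFAULT_SIZE
```

Returning None for already-sized items is what makes a second run a no-op.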

## Live state after this change

39 open issues all have complete Kind + Priority + Size field values
(36 mapped from explicit "## Effort" bodies; 3 defaulted to M for issues
without parseable effort).

## What stays UI-only

- The deprecated "Phase" project field still exists with v0..v4 data on
  issues — operator can delete in UI when ready.
- The deprecated "Estimate" project field (duplicate of GitHub's built-in
  Size) still exists — same UI-cleanup-later.
@hanwencheng
Member Author

Closing in favor of work landing on claude/hopeful-mccarthy-15e5ba — the user requested moving the changes there alongside the demo plan work, and refactored the scope (removed the label-sync workflow, removed the drift-audit workflow, removed the Blocked-by TEXT field in favor of GitHub's native issue relationships).

hanwencheng added a commit that referenced this pull request May 24, 2026
…#130)

* docs(research): AI hardware companion wedge + office-hours design doc

Add two business research artifacts under docs/research/:

- ai-hardware-companion-wedge.md (round 1+2): market sizing, competitive
  landscape, direct competitors, business model critique, 12 critical
  comments, naming, Stripe ACP / Alipay+ AMP integration path, WeChat
  feasibility, security-first demo storyboard.
- ai-hardware-companion-office-hours.md: YC-style office-hours
  diagnostic on the same wedge. Six forcing questions surfaced zero
  vendor conversations + no named buyer. P2 narrowed mid-session to
  memory portability + isolation + privacy. Approach D chosen:
  AgentKeys-native hosted sandbox (aiosandbox) with OpenClaw/Hermes
  agent runtime + per-actor isolation (issue #90) + cross-vendor
  memory consent model. Pricing pivoted to AWS-style elastic
  per-user (Free / Basic vendor-paid $2-3/active-device / Pro $10
  user-paid with 30% lifetime acquirer revshare / future Compute
  usage-based). 8/10 quality after 2 spec-review iterations.

Both index entries added to docs/research/README.md.

* docs(plan): issue #102 — aiosandbox + Hermes + AgentKeys ESP32 demo plan

End-to-end demo plan for the AgentKeys hardware-vendor wedge:
ESP32 device + simple URL config → agent-infra/sandbox running
Hermes (AgentKeys-native runtime) + agentkeys-daemon with mock
memory injected from S3 MD blob at agent boot.

12-step implementation order. Reuses arch.md canonical primitives
(sandbox runtime, supervisord lifecycle, memory bucket layout
bots/<actor_omni>/memory/, agentkeys-daemon). v0 scope: single
ESP32, single sandbox, single mock memory blob, text-mode chat.

Voice mode, multi-tenancy, cap-token enforcement, cross-vendor
portability, and payment rails are deferred to follow-up issues.

3-week effort estimate. Acceptance: reviewer can flash board + run
setup script + see personalized response within 15 minutes.

* issue #103: ESP32-S3 firmware foundation + plan rename

Pivot canonical demo target from generic ESP32 to ESP32-S3-DevKitC-1:
- Native USB-OTG (single USB-C, no separate UART chip)
- PSRAM (8MB octal) for voice follow-up audio buffers
- Xtensa LX7 with AI vector instructions for on-device wake-word
- Still MCU-class authenticity (~$10-15 dev board, <$5 chip in BOM volume)

Stack: PlatformIO + ESP-IDF (not Arduino) — production AI-toy vendors
use ESP-IDF and S3-specific features (native USB CDC, PSRAM, ESP-DSP,
secure boot, OTA) need IDF.

Scaffolded firmware foundation under firmware/esp32s3-agentkeys/:
- platformio.ini, CMakeLists.txt, sdkconfig.defaults, partitions.csv
- main.c spawns 4 FreeRTOS tasks (wifi/button/chat/led) coordinated
  via event group + queue
- wifi_sta.c: working STA mode + auto-reconnect
- button.c: working GPIO interrupt + 200ms debounce on BOOT (GPIO 0)
- led_status.c: stub blinker (real WS2812 RGB state machine is TODO)
- https_chat.c: stub echoing user input (real esp_http_client POST is TODO)
- config.h: NVS → secrets.h → hardcoded defaults priority order
- README.md: flash quickstart + troubleshooting

Foundation builds + flashes + boots into FreeRTOS loop today; chat
returns mock '[mock] you said: ...' echo. Real HTTPS POST is the
clear next step (esp_http_client + cJSON parse, ~100 lines).

Renamed plan file issue-102 → issue-103 to match actual issue number.

* research(xiaozhi): identify hardware as MagicLick 2.5 + pivot to Option 1

Hardware on hand confirmed via the device display showing 'magiclink
2p5/1.9.4': MagicLick 2.5 running xiaozhi-esp32 v1.9.4 firmware.

xiaozhi-esp32 (github.com/78/xiaozhi-esp32, MIT, 26K stars) is the
dominant Chinese open-source AI voice firmware for ESP32. Supports
70+ boards including ours. Full streaming voice pipeline already
shipping: offline wake-word (ESP-SR) → ASR → LLM → TTS → OPUS over
WebSocket or MQTT+UDP. MCP-based device + cloud control.

MagicLick 2.5 hardware specs reconstructed from
boards/magiclick-2p5/config.h + board.cc:
- ESP32-S3 chip
- ES8311 audio codec (full-duplex I2S, 24kHz)
- 128x128 GC9107 SPI LCD with emoji rendering
- 3 buttons (main GPIO 21, left GPIO 0, right GPIO 47)
- 2 WS2812 LEDs on GPIO 38
- DualNetworkBoard: WiFi primary + ML307 Cat.1 4G fallback
- Battery + power manager with tickless idle

'Hermes agent' clarified to mean NousResearch/hermes-agent (MIT,
Python, self-improving learning loop, multi-interface gateway,
LLM-agnostic). NOT an internal AgentKeys runtime as the original
plan §C4 mistakenly stated.

Strong recommendation: Option 1 — keep xiaozhi firmware unchanged,
build cloud-side xiaozhi-hermes-bridge that speaks the xiaozhi
WebSocket protocol while routing the agent loop to Hermes-agent
(which pulls memory from agentkeys-daemon per §C3). Reduces v0
effort from ~3 months (custom firmware) to ~2-3 weeks (server-side
adapter only). Forks from one of four existing reference server
implementations (Python xinnan-tech, Go hackers365 with openclaw,
Java joey-zhou, Go AnimeAIChat).

Hardware verification: 5 paths documented (visual / ROM bootloader
via boot button hold / WiFi captive portal / vendor app / disassembly).
USB doesn't enumerate by default because device is in normal firmware
mode; hold LEFT button while connecting USB to drop into ESP32-S3
ROM bootloader for esptool access.

Added PIVOT banner at top of issue-103 plan flagging that C4/C5/C6
are superseded. Full new direction in
docs/research/xiaozhi-esp32-magiclink.md.

firmware/esp32s3-agentkeys/ stays in tree as reference scaffolding
for future custom hardware (new product lines that need first-party
firmware), not the path for the MagicLick demo.

* research(xiaozhi-hermes): architecture diagrams + risk verification

Two new research docs supporting the issue #103 Option 1 direction:

docs/research/xiaozhi-hermes-architecture.md
  Permanent architecture reference with three ASCII diagrams:
  - Diagram A: baseline xiaozhi flow (device → cloud → LLM)
  - Diagram B: our pivoted flow with changed layers highlighted
    (UNCHANGED firmware, NEW URL only on device side, fork +
    one-module-rewrite on cloud side, new memory layer)
  - Diagram C: per-turn sequence with latency budget breakdown
    (~2.0-2.5s first-audio; ~+250-500ms delta vs baseline)
  Precise diff table: 13 layers compared, only 4 actually change,
  3 of those are NEW additions (not modifications). The actual code
  change is concentrated in ONE module of the bridge fork.

docs/research/xiaozhi-hermes-risks.md
  Risk verification grounded in actual Hermes-agent +
  xinnan-tech/xiaozhi-esp32-server source code, NOT assumptions.
  Specific file paths + line numbers cited throughout.

  R1 (Hermes HTTP gateway stateless-vs-session): REAL but
  mitigation is built-in. Gateway exposes /v1/chat/completions
  with three session modes (stateless per-call default, explicit
  continuation via X-Hermes-Session-Id, long-term memory scoping
  via X-Hermes-Session-Key). Bridge sets per-device session keys.
  Effort: 2-4 hours.
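
A sketch of how the bridge might set the per-device session headers named in R1. The header names and the /v1/chat/completions endpoint come from the text above; the key derivation (a truncated SHA-256 of the device ID) and the function name `session_headers` are assumptions for illustration.

```python
import hashlib
from typing import Dict, Optional


def session_headers(device_id: str, session_id: Optional[str] = None) -> Dict[str, str]:
    """Build per-device headers for a POST to /v1/chat/completions."""
    # Hypothetical derivation: a stable key per device that hides the raw ID.
    digest = hashlib.sha256(device_id.encode("utf-8")).hexdigest()[:32]
    headers = {
        "Content-Type": "application/json",
        "X-Hermes-Session-Key": f"device-{digest}",  # long-term memory scope
    }
    if session_id is not None:
        headers["X-Hermes-Session-Id"] = session_id  # explicit continuation
    return headers
```

Omitting `session_id` leaves the call in the stateless per-call default while still scoping long-term memory to the device.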

  R2 (Latency stack): mostly NOT real. agent/conversation_loop.py
  line 4152 confirms learning loop runs as background task AFTER
  response delivery, OFF the turn path. With enabled_toolsets=[]
  + max_iterations=1 + streaming SSE, overhead is ~50-200ms.
  xiaozhi-performance-research baselines:
  - ASR: 0.795s Xunfei / 0.85s Doubao
  - LLM first-token: 0.434s Qwen-Flash / 0.774s Kimi-K2
  - TTS: 0.488s CosyVoice / 0.667s Edge-TTS / 0.103s PaddleSpeech
  Pipelined: 1.4-2.4s first-audio, within 2.0-2.5s target.
  Effort: 1 day (tune + measure).
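
A back-of-envelope check of the R2 figures, summing the fastest and slowest quoted stage times plus the ~50-200ms Hermes overhead. It adds the stages sequentially with no pipelining overlap and ignores any cost not listed above, so it is only a sanity check on the quoted range.

```python
# (fastest, slowest) seconds per stage, taken from the baselines quoted above.
STAGES = {
    "asr": (0.795, 0.85),
    "llm_first_token": (0.434, 0.774),
    "tts": (0.103, 0.667),
}
HERMES_OVERHEAD = (0.05, 0.2)


def sequential_budget():
    """Return (best, worst) first-audio estimates in seconds."""
    best = sum(low for low, _ in STAGES.values()) + HERMES_OVERHEAD[0]
    worst = sum(high for _, high in STAGES.values()) + HERMES_OVERHEAD[1]
    return round(best, 3), round(worst, 3)
```

The sequential sums come out at about 1.38s and 2.49s, close to the quoted 1.4-2.4s, with the slowest combination sitting at the edge of the 2.0-2.5s target.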

  R3 (Concurrent device handling): less bad than feared. Hermes
  gateway IS multi-tenant by design (serves Telegram + Discord +
  Slack + WhatsApp + Signal + CLI from one process). Per-request
  memory ~20-80MB; 100 devices ~2-8GB on one VPS. xiaozhi-esp32-
  server's documented '100+ devices per process' claim is
  unverified in repo — only 6-concurrent demo documented. For v0:
  0 hours. For production scale: 1-2 weeks sticky-LB.

  R4 (newly discovered during research): cold agent construction
  per request adds 50-300ms on every turn. _create_agent() called
  inside _handle_chat_completions for EVERY request, no pooling.
  Most impactful for voice UX (compounds turn-by-turn).
  Mitigation: fork-local agent pool (1 day) or upstream patch
  (2-4 days).
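
One way the fork-local agent pool could look: a small LRU cache keyed by session key, so the cold construction cost is paid once per device instead of on every turn. The class name `AgentPool` and the factory signature are hypothetical; the sketch is not thread-safe and says nothing about how Hermes agents hold state between turns.

```python
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

A = TypeVar("A")


class AgentPool(Generic[A]):
    """Tiny LRU pool that reuses one agent per session key."""

    def __init__(self, factory: Callable[[str], A], max_size: int = 128) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._factory = factory
        self._max_size = max_size
        self._agents: "OrderedDict[str, A]" = OrderedDict()

    def get(self, session_key: str) -> A:
        agent = self._agents.get(session_key)
        if agent is None:
            agent = self._factory(session_key)  # cold construction, paid once
            self._agents[session_key] = agent
            if len(self._agents) > self._max_size:
                self._agents.popitem(last=False)  # evict least recently used
        else:
            self._agents.move_to_end(session_key)  # mark as recently used
        return agent
```

Bounding the pool keeps memory predictable, at the price of a cold start for devices that fall out of the cache.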

  Net effect: v0 timeline revised from ~3 weeks to ~1-2 weeks.

Updated docs/research/README.md to index both new docs.

* research(tuya) + revise v0 timeline ~3w → ~1-2w + fix unverified claim

Four updates following the risk-verification research:

1. docs/research/tuya-vs-xiaozhi.md (new)
   Answers 'is Tuya the same role as xiaozhi?': DIFFERENT role,
   partial firmware overlap. Tuya = closed PaaS for brand-owners
   (NYSE: TUYA, $80.9M Q1 2026 revenue, 306 premium customers,
   1.97M developers, 100+ countries). xiaozhi = open firmware
   for makers (MIT, 26.7K stars). TuyaOpen is a 1.6K-star
   defensive ESP32 SDK from Jan 2026 — 17x adoption gap.

   AgentKeys posture: complement both, never compete.
   - Phase 1 (now): xiaozhi cloud-side bridge (issue #103)
   - Phase 2 (3-6 mo): Tuya Cloud Development connector
   - Sit above both rails (same pattern as Alipay+ AMP / Stripe ACP)

2. v0 demo timeline revised from ~3 weeks to ~1-2 weeks
   in issue-103-aiosandbox-hermes-esp32-demo.md:
   - PIVOT banner at top of plan
   - Effort estimate section (line 441)
   The basis is xiaozhi-hermes-risks.md showing all four risks
   are smaller than originally feared (R1 built-in mitigation,
   R2 background loop, R3 multi-tenant by design, R4 cheap
   fork-local hack).

3. Fixed false cross-reference in xiaozhi-hermes-risks.md
   The 'unverified 100+ devices' claim was incorrectly
   attributed to the office-hours doc. It actually circulated
   in earlier informal discussion — not in any committed doc.
   Reworded to remove the false attribution.

4. Added implementation update banner to office-hours doc
   pointing readers at the four xiaozhi research docs + the
   revised v0 timeline. The §Recommended Approach / Pricing /
   Cross-Vendor Memory Model below stay unchanged — only the
   firmware-and-runtime layer shifted.

* research(tuya): verify Phase 3 IoT cloud adapter feasibility per-platform

Earlier version of tuya-vs-xiaozhi.md claimed Phase 3 would add
adapters for Xiaomi MIoT, Alibaba Smart Home, and Volcano AI Hub
without verifying each platform's third-party developer surface.
Research findings per platform:

Volcano Ark (ByteDance) — VERIFIED FEASIBLE
- Open international developer signup, no PRC entity / ICP needed
- MCP-server marketplace launched 2026 (mcp.so/server/mcp-server/volcengine)
- AgentKeys publishes an MCP tool any Doubao-powered AI hardware can call
- Genuinely Tuya-equivalent for the AI-side rather than IoT-side
- ~1 week effort

AliGenie / Tmall Genie (Alibaba) — FEASIBLE WITH PARTNERSHIP
- International Alibaba Cloud account works for sandbox + custom-skill webhook
- Production distribution onto Tmall Genie hardware requires Alibaba's
  skill review + de-facto PRC-domiciled brand
- ~1 week dev + partnership lead time

Xiaomi MIoT / XiaoAI — WEAKEST
- Brand-tier integration requires Mi Ecosystem partnership admission
- Publishable XiaoAI skills require PRC real-name verification
- Consumer-OAuth path (Home-Assistant-style) works today for foreign
  servers but is a narrower wedge than brand-tier
- Defer until partnership or scope to consumer-OAuth only

Rewrote Phase 3 section to split into 3a (Volcano open), 3b
(AliGenie with partner), 3c (Xiaomi deferred). Added explicit
'Honest note on Phase 3 verification' acknowledging the original
claim was hand-wavy. Added 15 source URLs to the Sources block.

* research(volcano-ark): MCP-server integration architecture + diagrams

New research doc with three ASCII diagrams showing how AgentKeys
integrates with Volcano Ark (ByteDance's enterprise AI cloud
hosting Doubao LLM) as a Phase 3a hosted MCP server registered in
their 2026 MCP marketplace.

Pattern B (hosted by us, marketplace is discovery only):
- AgentKeys MCP server at mcp.agentkeys.io exposes 5-7 tools
  (memory get/put, cred fetch, cap mint, audit append, whoami,
  permission check) mapped to existing Stage 7+ backend RPCs
- Vendor Doubao agents call our MCP tools via HTTPS/SSE with
  per-vendor Bearer token + per-actor X-AgentKeys-Actor header
- No vendor firmware changes; no Doubao runtime changes — just
  marketplace registration + one-checkbox vendor opt-in

Diagram A: high-level architecture (device → RTC → Doubao →
  MCP → AgentKeys MCP server → backend)
Diagram B: per-call MCP tool sequence with ~200-400ms per-call
  latency budget (concern noted: multiple tool calls per turn
  can stack — mitigation via batched 'context.bootstrap' tool)
Diagram C: cross-vendor composition showing same user (O_kevin)
  with FoloToy (Doubao + MCP adapter) AND MagicLick (xiaozhi +
  Hermes bridge) both terminating at one AgentKeys backend with
  one memory namespace + one identity tree + one audit ledger.
  This is the cross-vendor portability moat materializing
  automatically per office-hours doc §Cross-Vendor Memory Model.

Effort: ~1-1.5 weeks (sibling to xiaozhi-hermes-bridge).

6 open risks called out + mitigations sketched:
- MCP latency stacking per turn
- Marketplace approval SLA
- Per-tenant auth model TBD
- Actor omni resolution pattern (vendor-side vs whoami call)
- MCP protocol version compat with Doubao runtime
- Cross-vendor cap-token consent (resolved: same office-hours
  consent ceremony applies)

Updated docs/research/README.md to index the new doc.

* strategy: Agent IAM positioning + 4 architecture corrections

New strategic anchor doc at docs/research/agent-iam-strategy.md
captures the revised direction from multi-round discussion
(original Agent IAM proposal → independent analysis → ChatGPT
critique → synthesis).

Three-layer positioning, three audiences:
- AI Device Account (consumer/vendor BD pitch)
- Agent IAM (B2B/investor/CTO category)
- Trust Substrate (compliance/regulator/Web3 partner)

Five accepted strategic moves:
- Task Host vs Authority Host distinction (we are Authority)
- Agent IAM as the technical category (not key management / not
  memory MCP)
- MCP as integration surface, not product identity
- Zero orchestration in v1 — hard line
- Deploy → grow → standardize sequencing

Four architecture corrections that tighten commitments:

1. Revocation: 'immediate online, bounded TTL/cache offline'
   (NOT 'no propagation delay'). High-risk actions always
   online; low-risk reads use short-lived cached caps; offline
   mode denies sensitive actions by default.
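
The revocation rule can be read as a small decision function. The sketch below is one interpretation under stated assumptions: the function name `authorize`, the boolean risk flag, and the `check_online` callback standing in for a live authority check are all hypothetical.

```python
from typing import Callable, Optional


def authorize(
    high_risk: bool,
    online: bool,
    cached_cap_expires_at: Optional[float],
    now: float,
    check_online: Callable[[], bool],
) -> bool:
    """Decide whether an action may proceed under the revocation rule."""
    if high_risk:
        # High-risk actions always need a live check; offline means deny.
        return check_online() if online else False
    # Low-risk read: a short-lived cached capability is enough while fresh.
    if cached_cap_expires_at is not None and now < cached_cap_expires_at:
        return True
    # No fresh cache: fall back to a live check, or deny when offline.
    return check_online() if online else False
```

The cached-capability TTL is the bound on how long a revoked low-risk permission can keep working offline.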

2. Audit (two-tier): real-time off-chain feed in parent-control
   UI + 10-min batched Merkle root anchored to Heima. NOT
   real-time on-chain. Heima explorer is tamper-evidence proof,
   not the UX surface.
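
To make the batched anchoring concrete, here is a minimal Merkle root over one batch of audit entries. The tree construction (SHA-256, 0x00/0x01 domain prefixes, duplicating the last node on odd levels) is an assumption for illustration and is not taken from the strategy doc.

```python
import hashlib
from typing import List


def merkle_root(entries: List[bytes]) -> bytes:
    """Compute one root hash for a batch of audit entries."""
    if not entries:
        raise ValueError("cannot anchor an empty batch")
    # Prefix leaves with 0x00 so a leaf can never be confused with a node.
    level = [hashlib.sha256(b"\x00" + entry).digest() for entry in entries]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Only the 32-byte root goes on-chain per batch; the real-time feed in the parent-control UI keeps serving the individual entries off-chain.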

3. Delegation: agentkeys.delegation.grant is schema-documented
   but not active in v1. Returns not_implemented_in_v1. Active
   delegation lands in Phase 4.

4. Dual narrative — don't lead with 'Agent IAM' in consumer
   contexts; don't lead with 'memory portability' anywhere.
   Authority is the category; privacy/memory are benefits.

Phase 1 revised to three-act IAM demo (per office-hours doc
§9.6 storyboard, now elevated to authoritative spec):
- Act 1 Permissioned Memory (scoped read, not 'smart')
- Act 2 Deterministic Denial (policy decides, no LLM)
- Act 3 Online Revocation (parent UI → device denies)

Implementation note: cap-token machinery is already shipped via
Stage 7+ (broker, signer, K3/K10 HDKD, memory/cred/audit workers,
per-actor isolation per issue #90). New Phase 1 work is the
MCP server wrapper (~1 week), parent-control web UI (~3-4 days),
two-tier audit wiring (~1 day), runbook (~half day). Total ~2 weeks.

12-month roadmap revised:
- Phase 0: shipped (Stage 7+)
- Phase 1 (0-2 wk): Agent IAM v0 demo
- Phase 2 (1-2 mo): vendor pilot + multi-rail (Volcano Ark, Tuya)
- Phase 3 (3-4 mo): runtime neutrality (Hermes/OpenClaw as MCP tools)
- Phase 4 (6 mo): delegation + approval + ACL depth
- Phase 5 (post-12mo): standards engagement (contingent on traction)

Updates to existing docs:
- docs/research/README.md: indexed new strategy doc as 'Strategic anchor'
- ai-hardware-companion-office-hours.md: positioning note pivoted from
  'implementation update' to 'strategic update' pointing at strategy doc
- issue-103 plan: PIVOT banner expanded with three-act demo + four
  corrections; old §C4/C5/C6 marked superseded; cap-token shipped
  context made explicit; no implementation re-spec per user direction

* strategy(nits): chain-agnostic positioning + 2-min batch + memory namespace model

Three nits from review:

1. Generic chain instead of Heima-specific positioning
   The strategy doc shouldn't be Heima-locked — chain is a deployment
   config (arch.md describes 'Litentry parachain (or EVM L2 fallback)'
   so the design is already chain-agnostic at the contract layer).
   Updated all positioning text to 'audit chain' / 'on-chain' /
   'chain explorer' instead of Heima-specific. Kept arch.md and
   runbook refs to Heima where they describe actual deployed infra
   (the 'currently Heima per arch.md, swappable' note in §Phase 0
   captures the reality without committing the strategy to Heima).

2. 2-min batch instead of 10-min
   Modern fast-finality chains with cheap gas make sub-block-time
   batching viable. 10 min was too conservative — set 2 min as the
   default cadence. Faster batch = better UX for parents watching
   audit feed; the cost per anchor is sub-cent at typical batch sizes.

3. Memory namespace model (new §3.5)
   Read the memory research/design doc from main (commit 53ccc9f
   'docs: AI memory worker design plan + agent-memory research survey').
   It defines four STRUCTURAL types (profile / procedural / semantic /
   episodic) with specific S3 key derivation per type.

   For Agent IAM, namespaces are an ORTHOGONAL semantic dimension
   that composes with the 4 structural types. Memory item has BOTH
   a structural type AND a semantic namespace. Cap-tokens scope
   namespace access (namespaces_allowed claim, deterministic
   string-set membership check).

   v0 defaults: personal / family / work / travel (4 namespaces).
   kids/device/temp deferred to Phase 3-4.

   Composition is non-conflicting: namespaces live in wire-format
   metadata, NOT in the S3 key derivation. Memory worker filters
   at retrieval. The 4-type S3 layout from memory-design §3.2a is
   preserved exactly. Future evolution path documented (path-prefixed
   layout if scale demands).
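
A sketch of the deterministic namespace check and the retrieval-time filter. The `namespaces_allowed` claim and the four v0 namespaces come from the text above; the item metadata shape (a dict with `type` and `namespace` keys) and the function names are assumptions.

```python
from typing import Dict, Iterable, List

V0_NAMESPACES = frozenset({"personal", "family", "work", "travel"})


def namespace_allowed(namespaces_allowed: Iterable[str], namespace: str) -> bool:
    """Plain string-set membership check against the cap-token claim."""
    return namespace in frozenset(namespaces_allowed)


def filter_items(items: List[Dict[str, str]], namespaces_allowed: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only items whose namespace metadata the cap-token permits."""
    allowed = frozenset(namespaces_allowed)
    # The structural type is untouched; the filter reads metadata only.
    return [item for item in items if item.get("namespace") in allowed]
```

Because the check reads metadata rather than the S3 key, the four-type key layout stays exactly as it is.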

   arch.md compatibility check: zero contradictions found.
   - Memory data_class binding (§17.5) unchanged
   - Per-actor PrincipalTag isolation (§17) unchanged
   - Cap-token format extensible (namespaces_allowed is additive)
   - Memory worker never calls LLM invariant preserved
   - K3 epoch rotation unchanged
   - Architecture-as-source-of-truth: future arch.md §17 + memory-
     design §3 get additive paragraphs when v0 ships, no canonical-
     name conflicts introduced.

Files updated:
- docs/research/agent-iam-strategy.md: §3.2 audit (2-min + chain-
  agnostic), §3.5 NEW memory namespace model with arch.md compat
  check, Phase 0 line (Heima → 'currently Heima per arch.md,
  swappable')
- docs/research/README.md: strategy doc summary updated with 2-min
  + namespace model
- docs/research/ai-hardware-companion-office-hours.md: implementation
  update banner reflects 2-min on-chain anchor
- docs/research/volcano-ark-mcp-integration.md: diagram boxes
  generic ('AWS S3, audit chain', 'off-chain + chain')
- docs/spec/plans/issue-103-aiosandbox-hermes-esp32-demo.md:
  PIVOT banner reflects 2-min chain-agnostic anchor; NOT-in-scope
  list generic 'on-chain audit anchoring'

* pm: declarative milestones + labels + issue automation + dashboard guide

New pm/ subfolder for GitHub project management automation. Treats
milestones / labels / issue categorization as code under version
control with idempotent shell scripts that reconcile GitHub state
to declarative JSON.

Files:
- pm/README.md — folder purpose + how to use
- pm/milestones.json — 7 roadmap milestones (M1-M7) source of truth
- pm/labels.json — 40-label taxonomy: area/ kind/ phase/ status/
  priority/ + extras (needs-arch-review, vendor-blocker)
- pm/issue-assignments.json — categorization of all 23 pre-existing
  open issues with milestone + labels + notes
- pm/new-issues.json — 20 new Phase 1-7 issues to create
- pm/arch-md-verification-report.md — #5/#6/#9/#37 verification
- pm/PROJECT-DASHBOARD-GUIDE.md — how to use projects/19 board +
  CI integration patterns
- pm/scripts/sync-milestones.sh — idempotent: creates/updates from
  milestones.json
- pm/scripts/sync-labels.sh — idempotent: creates/updates from
  labels.json
- pm/scripts/sync-issues.sh — idempotent: assigns milestone+labels
  to each issue in issue-assignments.json
- pm/scripts/create-issues.sh — idempotent: creates new issues from
  new-issues.json, skips if title already exists
- pm/scripts/audit.sh — read-only: groups open issues by milestone,
  flags uncategorized + missing area/* labels
- pm/scripts/add-to-project.sh — adds issues to litentry/projects/19
  (requires gh auth refresh -s project,read:project)
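
The idempotent reconcile pattern behind the sync scripts can be sketched as a pure planning step. This is an illustrative Python version, not the shell implementation; the label record shape (`name`, `color`, `description`) is an assumption about pm/labels.json.

```python
from typing import Dict, List, Tuple

Label = Dict[str, str]


def plan_label_sync(desired: List[Label], live: List[Label]) -> Tuple[List[Label], List[Label]]:
    """Return (to_create, to_update) needed to bring live labels to desired."""
    live_by_name = {label["name"]: label for label in live}
    to_create: List[Label] = []
    to_update: List[Label] = []
    for label in desired:
        current = live_by_name.get(label["name"])
        if current is None:
            to_create.append(label)
        elif any(current.get(key) != value for key, value in label.items()):
            to_update.append(label)
    return to_create, to_update
```

Once live state matches the JSON, both lists come back empty, which is what makes re-running the script safe.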

Executed in this session:
- Created 7 milestones (M1: First MCP demo + Volcano Ark PoC, M2:
  First vendor wedge, M3: Runtime neutrality, M4: Capability +
  revocation depth, M5: Native mobile + biometric, M6: TEE
  integration + security, M7: Standards + ecosystem)
- Created 40 labels across 5 namespaces (area, kind, phase,
  status, priority) + extras (needs-arch-review, vendor-blocker)
- Categorized 23 pre-existing open issues with milestones + labels
- Created 20 new issues (#107-#126) for Phase 1-7 work per the
  agent-iam-strategy.md roadmap
- Verified #5, #6, #9, #37 against arch.md — verdicts: #5 partially
  aligned (closed; lives as tier A in §15.3), #6 needs design
  refresh against current K11+SidecarRegistry, #9 already
  implemented as K3 HDKD per §6.2 (recommend close), #37 superseded
  by K11 WebAuthn per §K11 (recommend close)

Final state: 43 open issues, 100% categorized to milestones, 100%
labeled with area/*. No uncategorized issues.

Per user direction: did NOT merge / close #5/#6/#9/#37 even though
recommendations are clear. User to make final close decisions.

* pm: fix bash 3.2 portability + add setup-project-fields.sh + labels-vs-fields strategy

Three fixes responding to user feedback:

1. add-to-project.sh: replace mapfile (bash 4+) with while-read loop
   for macOS bash 3.2 portability per CLAUDE.md project standard.
   Verified working: 'bash pm/scripts/add-to-project.sh 103' now
   successfully adds the issue to litentry/projects/19.

2. NEW pm/scripts/setup-project-fields.sh: creates the canonical
   project-level fields (Priority, Phase, Estimate, Iteration, Risk,
   Notes) via gh project field-create. Solves the 'cluttered Labels
   column' UX pain by letting the user split single-value PM
   concerns (priority, phase, status) out of the multi-value labels
   pile into typed field columns.

3. PROJECT-DASHBOARD-GUIDE.md: added 'Labels vs Fields — when to
   use which' section explaining the split:
   - Labels (repo-level, multi-value): area/*, kind/*, semantic
     flags like needs-arch-review, vendor-blocker
   - Fields (project-level, single-value): Priority, Phase, Status,
     Estimate, Risk
   Plus step-by-step instructions to migrate the cluttered Labels
   column to clean field-based grouping.

These don't change the strategic plan; they just fix the operational
PM-board ergonomics the user surfaced from running the script live.

* pm: workflow-first PM guidance + mark add-to-project.sh as backfill

User pointed out the project board has 10 built-in workflows that
replace much of what the scripts do. Updated guidance to prefer
workflows; scripts are fallback/batch tools.

PROJECT-DASHBOARD-GUIDE.md updates:
- Replaced the brief 'Recommended workflows' section with a full
  table of the 10 built-in workflows + their default state + what
  to configure
- New 'Script ↔ workflow split' table making clear which jobs use
  workflows vs scripts (workflows for runtime project events; scripts
  for repo-level state, batch creation, field definitions)
- One-time workflow configuration checklist (3 steps: set the
  Auto-add filter, verify the other green workflows, optionally
  enable Auto-archive)

add-to-project.sh updates:
- Header now flags this as PRIMARILY A BACKFILL / FALLBACK TOOL
- Lists three legit use cases: backfilling pre-existing issues,
  fallback when Auto-add workflow is misconfigured, adding from
  a different repo via PM_REPO override
- Pointer to PROJECT-DASHBOARD-GUIDE.md for workflow setup

No script behavior changes; only the documentation is tightened to
match the workflow-first reality.

* pm: programmatic workflow audit (names + enabled state; filter/action stay manual)

User asked whether workflows can be checked programmatically. Partially:
GitHub's public GraphQL ProjectV2Workflow type exposes only:
  id, name, number, enabled, createdAt, updatedAt, project, fullDatabaseId
It does NOT expose the filter expression or action configuration (those
are UI-only, not in the public API).

So we get:
  ✅ 'is the workflow enabled' check
  ❌ 'does the workflow do the right thing' check (filter/action body)

New files:
- pm/expected-workflows.json: declarative source of truth for what
  workflows should be enabled + what each one's filter/action should
  do (free-text 'verify_in_ui' field that engineers cross-check
  against the UI)
- pm/scripts/check-workflows.sh: audits live workflows on
  litentry/projects/19 vs expected-workflows.json
  - Confirms enabled state matches
  - Flags unexpected workflows that exist but aren't in our list
  - Prints all per-workflow expected filter/action notes for
    manual UI verification
  - Exits 0 when all expectations match, 1 on mismatch (CI-friendly)
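
The enabled-state comparison described above might be sketched like this. It is an illustration under stated assumptions, not the real check-workflows.sh: both inputs are assumed to be plain `name<TAB>true|false` files, whereas the actual script reads expected-workflows.json and live GraphQL data. The function name `audit_workflows` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: compare expected vs live workflow enabled flags.
# Each input file holds "name<TAB>true|false" lines (an assumed format).
audit_workflows() {
  local expected_file=$1 live_file=$2 status=0 name want got

  # Every expected workflow must exist live with the same enabled flag.
  while IFS=$'\t' read -r name want; do
    [ -n "$name" ] || continue
    got=$(awk -F'\t' -v n="$name" '$1 == n { print $2 }' "$live_file")
    if [ -z "$got" ]; then
      echo "MISSING: $name"
      status=1
    elif [ "$got" != "$want" ]; then
      echo "MISMATCH: $name (expected enabled=$want, live enabled=$got)"
      status=1
    else
      echo "OK: $name"
    fi
  done < "$expected_file"

  # Any live workflow that is absent from the expected list is flagged.
  while IFS=$'\t' read -r name _; do
    [ -n "$name" ] || continue
    if ! awk -F'\t' -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$expected_file"; then
      echo "UNEXPECTED: $name"
      status=1
    fi
  done < "$live_file"

  # 0 when everything matches, 1 on any mismatch (CI-friendly).
  return "$status"
}
```

Called as `audit_workflows expected.tsv live.tsv`, it prints one line per workflow and returns non-zero on drift, so it can gate a CI step. The filter and action contents still need a manual check in the UI, since the API does not expose them.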

Live audit result (verified on litentry/projects/19): 7 expected
workflows enabled (Auto-add to project, Auto-add sub-issues to
project, Item added/closed, Auto-close issue, PR linked/merged),
4 optional workflows correctly disabled (Auto-archive, Code review
approved, Code changes requested, Item reopened). 11/11 match.

This script can be wired into a future CI workflow to alert on
drift if anyone disables Auto-add to project or similar.

* pm: automate project field sync + workflow drift audit via GH Actions

Adds two GitHub Actions and one supporting script to push project automation
to its API ceiling. After this change, label-to-field sync runs on every
relevant issue event and workflow drift detection runs daily and on push,
instead of both being run as manual scripts.

What landed:

- .github/workflows/pm-sync-fields-from-labels.yml: triggers on issues
  labeled/unlabeled/opened/transferred. Calls sync-fields-from-labels.sh
  to mirror priority/p* + phase/v* labels into the project's Priority + Phase
  single-select fields. workflow_dispatch variant for backfill.

- .github/workflows/pm-workflow-audit.yml: daily cron + push trigger.
  Runs check-workflows.sh against expected-workflows.json and opens (or
  comments on) a tracking issue when drift is detected.

- pm/scripts/sync-fields-from-labels.sh: backing script for the sync workflow.
  Forgiving mode (warns + skips when a field is missing rather than aborting),
  bash 3.2 portable, uses -f for option-ID strings to avoid gh api numeric
  coercion.

- pm/scripts/setup-project-fields.sh: now detects + rebuilds empty-placeholder
  single-select fields (GitHub's built-in Priority/Size ship with zero options)
  and cleans up "Project <Name>" zombie fields left behind when
  deleteProjectV2Field renames instead of deleting system-reserved names.
  Fully idempotent.

- pm/PROJECT-DASHBOARD-GUIDE.md: new "What's automated vs UI-only" verdict
  table (built-in workflow filter/action contents + custom views are 100%
  UI-only — no API mutation exists for either). New "Known gotcha" section
  on Priority-field zombies. Script-vs-workflow split rewritten as three-tier
  matrix (built-in / our GH Action / bash script).
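
The label-to-field mirroring and the `-f` choice can be illustrated with a short sketch. It is not the real sync-fields-from-labels.sh: the function names are hypothetical, and the p0..p3 to Urgent/High/Medium/Low mapping is assumed from the option names in the PR summary. `set_single_select` is only defined here, not invoked, because it needs a token and real node IDs.

```bash
#!/usr/bin/env bash
# Hypothetical: map the first priority/p* label on stdin to a Priority option name.
# The p0..p3 ordering is an assumption based on the PR summary table.
priority_from_labels() {
  local label
  while IFS= read -r label; do
    case "$label" in
      priority/p0) echo "Urgent"; return 0 ;;
      priority/p1) echo "High";   return 0 ;;
      priority/p2) echo "Medium"; return 0 ;;
      priority/p3) echo "Low";    return 0 ;;
    esac
  done
  return 1  # no priority label found; a forgiving caller would warn and skip
}

# Hypothetical wrapper around GitHub's updateProjectV2ItemFieldValue mutation.
# Args: project node ID, item node ID, field node ID, option ID.
set_single_select() {
  # -f sends every value as a raw string. With -F, gh would convert an
  # all-digit option ID into a number, which would not match the String
  # type that singleSelectOptionId expects.
  gh api graphql \
    -f query='mutation($project: ID!, $item: ID!, $field: ID!, $option: String!) {
      updateProjectV2ItemFieldValue(input: {
        projectId: $project, itemId: $item, fieldId: $field,
        value: { singleSelectOptionId: $option }
      }) { projectV2Item { id } }
    }' \
    -f project="$1" -f item="$2" -f field="$3" -f option="$4"
}
```

A caller could pipe an issue's label names into `priority_from_labels` and, on success, look up the matching option ID before calling `set_single_select`.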

Verification: tested live against litentry/projects/19. Backfilled 40+
issues onto board, synced Priority + Phase from labels on every one, zero
zombie fields remain. setup-project-fields.sh second-run shows all skips.

API ceiling discovered via GraphQL introspection: ProjectV2Workflow has
no create/update mutation (only delete). ProjectV2View has no create/update
mutation at all. Both can be read via the API but are UI-only to configure.

Required repo secret for CI: PM_PROJECT_TOKEN (fine-grained PAT with
Projects=read+write, Issues=read+write). Documented in dashboard guide.

* pm: simplify automation — drop audit + label-sync workflows, use GitHub native

User feedback after live use of the migration:
- The label→field sync workflow is no longer needed (labels were deleted in
  PR #129; fields are now the source of truth, set via the issue-create skill
  or manually in UI).
- The workflow-drift audit workflow added noise without value (built-in
  workflows rarely drift, and the operator manages them in UI anyway).
- The Blocked-by TEXT project field duplicates GitHub's native issue
  relationships ("Mark as blocked by" / "Mark as blocking" in the UI side
  panel, keyboard `B B` / `B X`). Use the native feature.

## Removed

- .github/workflows/pm-workflow-audit.yml (drift detection — operator handles in UI)
- .github/workflows/pm-sync-fields-from-labels.yml (labels-to-fields sync — labels are gone)
- pm/expected-workflows.json (declarative expectation for the audit)
- pm/scripts/check-workflows.sh (called by the audit)
- pm/scripts/sync-fields-from-labels.sh (called by the sync workflow)
- "Blocked by" project field (deleted via API; setup-project-fields.sh no longer creates it)

## Kept / added

- .github/workflows/pm-auto-archive-closed-pr.yml — auto-archives PRs from the
  board on close (built-in Auto-archive only fires after 30 days)
- pm/scripts/sync-size-from-effort.sh (NEW) — one-shot bulk-populate of the
  Size project field by parsing each issue's "## Effort" body section.
  Idempotent (skips already-sized items). Defaults to M when no parseable
  effort line found.
- ~/.claude/skills/agentkeys-issue-create — updated to:
  - Set Kind/Priority/Size project fields directly via API (replaces deleted
    label-sync workflow)
  - Use GitHub native relationships for blocked-by (replaces removed field)
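
The Effort-to-Size parsing might work along these lines. This is a sketch under assumptions, not the real sync-size-from-effort.sh: the source does not specify the format of the effort line, so the sketch assumes a standalone XS/S/M/L/XL token somewhere inside the "## Effort" section. `size_from_body` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical: read an issue body on stdin and print a Size value.
# Assumes the "## Effort" section contains a standalone XS/S/M/L/XL token.
size_from_body() {
  local size
  size=$(awk '
    # Track whether the current level-2 section is "## Effort".
    /^## / { in_effort = ($0 ~ /^## Effort[ \t]*$/); next }
    in_effort {
      # Split on non-letters and take the first word that is a size token.
      n = split($0, words, /[^A-Za-z]+/)
      for (i = 1; i <= n; i++) {
        if (words[i] ~ /^(XS|S|M|L|XL)$/) { print words[i]; exit }
      }
    }
  ')
  # Default to M when no parseable effort line was found.
  echo "${size:-M}"
}
```

For example, `gh issue view 107 --json body --jq .body | size_from_body` would feed one issue body through the parser. Skipping items that already have a Size, which is what makes the real script idempotent, is left out of this sketch.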

## Live state after this change

39 open issues all have complete Kind + Priority + Size field values
(36 mapped from explicit "## Effort" bodies; 3 defaulted to M for issues
without parseable effort).

## What stays UI-only

- The deprecated "Phase" project field still exists with v0..v4 data on
  issues — operator can delete in UI when ready.
- The deprecated "Estimate" project field (duplicate of GitHub's built-in
  Size) still exists; the operator can likewise delete it in the UI when ready.

* docs: archive v1/v2 staging docs + add M1-M7 milestone roadmap

The v1/v2 staged plan framing retires after v2-stage3 ships green. Going
forward, milestone-level work (M1-M7) is tracked against the new
docs/spec/plans/milestones-roadmap.md — the operational companion to
agent-iam-strategy.md.

## Archived (moved to docs/archived/ with _2026-04 suffix)

- docs/stage7-demo-and-verification.md (123KB, the big stage-7 end-to-end demo doc)
- docs/operator-runbook-stage7.md (39KB, supplanted by scripts/setup-broker-host.sh)
- docs/stage8-wip.md (15KB, off-chain vault design now in arch.md + threat-model)
- docs/spec/plans/development-stages.md (the 8-stage v2 plan, replaced by milestones-roadmap.md)

Per CLAUDE.md docs policy: archive, never delete; archived files are never
read in normal dev.
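
The archive move can be sketched as a small helper. This is illustrative only: it assumes a flat docs/archived/ directory and that the `_2026-04` suffix goes before the file extension, neither of which the source spells out. `archived_path` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical: compute the archive destination for a doc path.
# Assumes a flat docs/archived/ layout with the suffix before the extension.
archived_path() {
  local src=$1 suffix=${2:-_2026-04} base
  base=$(basename "$src")
  case "$base" in
    *.*) echo "docs/archived/${base%.*}${suffix}.${base##*.}" ;;
    *)   echo "docs/archived/${base}${suffix}" ;;
  esac
}
```

Inside the repo, something like `git mv docs/stage8-wip.md "$(archived_path docs/stage8-wip.md)"` would then move a file while keeping its history, in line with the archive-never-delete policy.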

## Added

- docs/spec/plans/milestones-roadmap.md — M1-M7 detail + post-M7 horizons
  + strategic risks table + how-to-use-this-doc. Cross-references arch.md
  for invariants and agent-iam-strategy.md for positioning. This becomes
  the authoritative milestone plan from M1 onward.

## Cross-refs updated (active docs only)

- docs/arch.md: §24 + §25 cross-refs now point at scripts/setup-broker-host.sh
  (canonical idempotent runbook) + archived stage-7 commentary for history
- docs/dev-setup.md: 5 stage7/dev-stages refs → setup-broker-host.sh +
  milestones-roadmap.md
- docs/v2-stage1-migration-and-demo.md: 4 stage7 refs → archive locations +
  status banner noting v1/v2 retirement after v2-stage3
- CLAUDE.md: 3 refs (build plan, runbook policy, harness workflow) →
  milestones-roadmap.md
- docs/spec/{threat-model-key-custody,ses-email-architecture,credential-backend-interface}.md:
  stage8-wip refs → archive
- docs/spec/heima-gaps-vs-desired-architecture.md: stage7 demo §4 → archive
- docs/wiki/upstream-backend-classes-exercise-vs-distribution.md: stage7
  demo refs → archive (wiki auto-publishes to GitHub Wiki via publish-wiki.yml)

## What's NOT updated (intentional)

Issue-specific plan files under docs/spec/plans/issue-64/ + issue-74-* +
issue-credential-storage-* still reference the archived docs by name.
These are themselves historical issue-deliverable records; the references
are timestamped artifacts of when those issues were planned, not active
operational links. They stay as-is.